JKO scheme
Normalizing flow neural networks by JKO scheme
Normalizing flows are a class of deep generative models for efficient sampling and likelihood estimation, achieving attractive performance, particularly in high dimensions. The flow is often implemented as a sequence of invertible residual blocks. Existing works adopt special network architectures and regularization of flow trajectories. In this paper, we develop a neural ODE flow network called JKO-iFlow, inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which unfolds the discrete-time dynamics of the Wasserstein gradient flow. The proposed method stacks residual blocks one after another, allowing efficient block-wise training that avoids sampling SDE trajectories, score matching, and variational learning, thus reducing the memory load and the difficulty of end-to-end training. We also develop adaptive time reparameterization of the flow network with progressive refinement of the induced trajectory in probability space to further improve model accuracy. Experiments with synthetic and real data show that the proposed JKO-iFlow network achieves competitive performance compared with existing flow and diffusion models at a significantly reduced computational and memory cost.
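To make the block-wise training concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation): each residual block $T(x) = x + f(x)$ is fit to a per-block objective taken here as the KL divergence to a standard-normal target (up to a constant) plus the $W_2$ proximal penalty; the exact per-sample log-determinant is affordable only in low dimension.

```python
# Hypothetical sketch (not the authors' code): block-wise JKO training in PyTorch.
# Each block minimizes  E[-log N(T(x); 0, I) - log|det dT/dx|] + (1/2h) E[||T(x)-x||^2].
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, dim))

    def forward(self, x):
        return x + self.f(x)

def jko_block_loss(block, x, h=0.5):
    y = block(x)
    neg_log_target = 0.5 * (y ** 2).sum(dim=1)       # -log N(y; 0, I) + const
    # exact per-sample log-determinant; only practical in low dimension
    logdet = torch.stack([
        torch.linalg.slogdet(
            torch.autograd.functional.jacobian(block, xi, create_graph=True))[1]
        for xi in x])
    move_cost = ((y - x) ** 2).sum(dim=1) / (2 * h)  # W2 proximal penalty
    return (neg_log_target - logdet + move_cost).mean()

# Stack blocks one after another: train a block, freeze it, push samples through.
dim, x, blocks = 2, torch.randn(128, 2) + 3.0, []    # toy source distribution
for k in range(4):
    block = ResidualBlock(dim)
    opt = torch.optim.Adam(block.parameters(), lr=1e-3)
    for _ in range(100):
        opt.zero_grad()
        jko_block_loss(block, x).backward()
        opt.step()
    with torch.no_grad():
        x = block(x)                                 # input samples for block k+1
    blocks.append(block)
```

Since only one block is trained at a time and earlier blocks are frozen, the peak memory is that of a single block, which is the practical advantage the abstract describes.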
Iterative Refinement of Flow Policies in Probability Space for Online Reinforcement Learning
Sun, Mingyang, Ding, Pengxiang, Zhang, Weinan, Wang, Donglin
While behavior cloning with flow/diffusion policies excels at learning complex skills from demonstrations, it remains vulnerable to distributional shift, and standard RL methods struggle to fine-tune these models due to their iterative inference process and the limitations of existing workarounds. In this work, we introduce the Stepwise Flow Policy (SWFP) framework, founded on the key insight that discretizing the flow matching inference process via a fixed-step Euler scheme inherently aligns it with the variational Jordan-Kinderlehrer-Otto (JKO) principle from optimal transport. SWFP decomposes the global flow into a sequence of small, incremental transformations between proximate distributions. Each step corresponds to a JKO update, regularizing policy changes to stay near the previous iterate and ensuring stable online adaptation with entropic regularization. This decomposition yields an efficient algorithm that fine-tunes pre-trained flows via a cascade of small flow blocks, offering significant advantages: simpler/faster training of sub-models, reduced computational/memory costs, and provable stability grounded in Wasserstein trust regions. Comprehensive experiments demonstrate SWFP's enhanced stability, efficiency, and superior adaptation performance across diverse robotic control benchmarks.
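Schematically (in our notation, not necessarily the paper's), the fixed-step Euler discretization of the flow and the JKO-type trust-region update that each small block is matched to read

$$
a_{k+1} = a_k + \Delta t \, v_\theta(a_k, t_k),
\qquad
\pi_{k+1} = \operatorname*{arg\,min}_{\pi} \; \mathcal{F}(\pi) + \frac{1}{2\Delta t} W_2^2(\pi, \pi_k),
$$

so that each incremental transformation moves the policy distribution only a small Wasserstein distance from the previous iterate, which is the source of the claimed stability.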
Computational and Statistical Asymptotic Analysis of the JKO Scheme for Iterative Algorithms to Update Distributions
The seminal work of Jordan, Kinderlehrer, and Otto [33] developed what is now widely known as the JKO scheme, a foundational method for generating iterative algorithms to compute distributions, one that has reshaped our understanding of sampling algorithms. The JKO scheme can be interpreted as a gradient flow of the free energy with respect to the Wasserstein metric, often referred to as the Wasserstein gradient flow. This interpretation has led to significant advances in machine learning, including applications in reinforcement learning to solve policy-distribution optimization problems [55]. While the JKO scheme traditionally assumes that the underlying model is fully known, in this paper we relax this assumption by allowing models with unknown parameters. We develop statistical approaches to estimate these parameters and adapt the JKO scheme to work with the estimated values. Specifically, Langevin equations (stochastic differential equations) play a key role in describing the evolution of physical systems, facilitating stochastic gradient descent in machine learning, and enabling Markov chain Monte Carlo (MCMC) simulations in numerical computing; for examples and detailed discussions, see [11, 8, 51, 22, 19, 39, 43]. Solutions to Langevin equations, known as Langevin diffusions, are stochastic processes whose distributions evolve according to the Fokker-Planck equations [27, 48].
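For reference, the JKO update with step size $\tau > 0$, and the Langevin equation with the Fokker-Planck equation governing its law, can be written as

$$
\rho_{k+1} = \operatorname*{arg\,min}_{\rho} \; F(\rho) + \frac{1}{2\tau} W_2^2(\rho, \rho_k),
$$

$$
dX_t = -\nabla U(X_t)\, dt + \sqrt{2}\, dW_t,
\qquad
\partial_t \rho_t = \nabla \cdot (\rho_t \nabla U) + \Delta \rho_t,
$$

where, for the free energy $F(\rho) = \int U \, d\rho + \int \rho \log \rho \, dx$, the Fokker-Planck equation is exactly the Wasserstein gradient flow of $F$, and the JKO iterates recover this flow in the limit $\tau \to 0$.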
JKO for Landau: a variational particle method for the homogeneous Landau equation
Inspired by the gradient-flow viewpoint of the Landau equation and the corresponding dynamic formulation of the Landau metric in [arXiv:2007.08591], we develop a novel implicit particle method for the Landau equation in the framework of the JKO scheme. We first reformulate the Landau metric in a computationally friendly form, and then translate it into the Lagrangian viewpoint using the flow map. A key observation is that, while the flow map evolves according to a rather complicated integral equation, the unknown component is merely a score function of the corresponding density plus an additional term in the null space of the collision kernel. This insight guides us in approximating the flow map with a neural network and simplifies the training. Additionally, the objective function takes a double-summation form, making it highly suitable for stochastic methods. Consequently, we design a tailored version of stochastic gradient descent that maintains particle interactions and reduces the computational complexity. Compared to other deterministic particle methods, the proposed method enjoys exact entropy dissipation and unconditional stability, making it suitable for large-scale plasma simulations over extended time periods.
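The double-summation structure admits a simple unbiased minibatch estimator; the sketch below is hypothetical, with a placeholder pairwise term `psi` standing in for the paper's actual integrand, and only illustrates how subsampling particle pairs preserves interactions in expectation.

```python
import torch

def minibatch_double_sum(psi, particles, batch=64):
    """Unbiased estimate of J = (1/N^2) * sum_i sum_j psi(x_i, x_j):
    with i, j drawn independently and uniformly, E[psi(x_i, x_j)] = J."""
    N = particles.shape[0]
    i = torch.randint(N, (batch,))
    j = torch.randint(N, (batch,))
    return psi(particles[i], particles[j]).mean()
```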
Importance Corrected Neural JKO Sampling
Hertrich, Johannes, Gruhlke, Robert
In order to sample from an unnormalized probability density function, we propose to combine continuous normalizing flows (CNFs) with rejection-resampling steps based on importance weights. We relate the iterative training of CNFs with regularized velocity fields to a JKO scheme and prove convergence of the involved velocity fields to the velocity field of the Wasserstein gradient flow (WGF). The alternation of local flow steps and non-local rejection-resampling steps makes it possible to overcome local minima or slow convergence of the WGF for multimodal distributions. Since the proposals for the rejection steps are generated by the model itself, they do not suffer from the common drawbacks of classical rejection schemes. The resulting model can be trained iteratively, reduces the reverse Kullback-Leibler (KL) loss in each step, allows the generation of i.i.d. samples, and moreover permits evaluation of the underlying density of the generated samples. Numerical examples show that our method yields accurate results on various test distributions, including high-dimensional multimodal targets, and significantly outperforms the state of the art in almost all cases.
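As one generic illustration (a simple variant, not the paper's exact procedure), a rejection-resampling step driven by importance weights $w = \tilde p / q$, where $q$ is the model density and $\tilde p$ the unnormalized target, might look as follows.

```python
import numpy as np

def importance_rejection_resample(x, log_w, rng=None):
    """x: proposal samples from the model q; log_w: log(p_tilde(x) / q(x)).
    Accept each sample with probability w / max(w), then refill rejected
    slots by importance resampling from the same batch (hypothetical variant)."""
    rng = rng or np.random.default_rng()
    w = np.exp(log_w - log_w.max())              # rescaled weights in (0, 1]
    accept = rng.random(len(x)) < w
    refill = rng.choice(len(x), size=int((~accept).sum()), p=w / w.sum())
    return np.concatenate([x[accept], x[refill]])
```

Because the proposal here is the model's own output, the weights stay well-behaved as training progresses, which is the advantage over classical rejection schemes that the abstract points to.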
Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport
Choi, Jaemoo, Choi, Jaewoong, Kang, Myungjoo
Wasserstein Gradient Flow (WGF) describes the gradient dynamics of probability densities within the Wasserstein space. WGF provides a promising approach for optimization over probability distributions. Numerically approximating the continuous WGF requires a time discretization method, the best known of which is the JKO scheme. Accordingly, previous WGF models employ the JKO scheme and parametrize a transport map for each JKO step. However, this approach results in quadratic training complexity $O(K^2)$ in the number of JKO steps $K$, which severely limits the scalability of WGF models. In this paper, we introduce a scalable WGF-based generative model, called Semi-dual JKO (S-JKO). Our model is based on the semi-dual form of the JKO step, derived from the equivalence between the JKO step and Unbalanced Optimal Transport. Our approach reduces the training complexity to $O(K)$. We demonstrate that our model significantly outperforms existing WGF-based generative models, achieving FID scores of 2.62 on CIFAR-10 and 6.19 on CelebA-HQ-256, which are comparable to state-of-the-art image generative models.
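For orientation (our notation; the paper works with an unbalanced relaxation of this), the semi-dual Kantorovich form of the quadratic-cost OT problem appearing inside each JKO step is

$$
\tfrac{1}{2} W_2^2(\mu, \nu) = \max_{\varphi} \int \varphi^c \, d\mu + \int \varphi \, d\nu,
\qquad
\varphi^c(x) = \inf_{y} \left[ \tfrac{1}{2} \|x - y\|^2 - \varphi(y) \right],
$$

which turns the nested optimal-transport computation inside each JKO step into a tractable maximization over a single potential $\varphi$.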
Convergence of flow-based generative models via proximal gradient descent in Wasserstein space
Cheng, Xiuyuan, Lu, Jianfeng, Tan, Yixin, Xie, Yao
Flow-based generative models enjoy certain advantages in computing the data generation and likelihood, and have recently shown competitive empirical performance. Compared to the accumulating theoretical studies on related score-based diffusion models, analysis of flow-based models, which are deterministic in both the forward (data-to-noise) and reverse (noise-to-data) directions, remains sparse. In this paper, we provide a theoretical guarantee of generating the data distribution by a progressive flow model, the so-called JKO flow model, which implements the Jordan-Kinderlehrer-Otto (JKO) scheme in a normalizing flow network. Leveraging the exponential convergence of proximal gradient descent (GD) in Wasserstein space, we prove a Kullback-Leibler (KL) guarantee of data generation by a JKO flow model of $O(\varepsilon^2)$ when using $N \lesssim \log (1/\varepsilon)$ JKO steps ($N$ residual blocks in the flow), where $\varepsilon$ is the error in the per-step first-order condition. The only assumption on the data density is a finite second moment, and the theory extends to data distributions without density and to inversion errors in the reverse process, in which case we obtain mixed KL-$W_2$ error guarantees. The non-asymptotic convergence rate of the JKO-type $W_2$-proximal GD is proved for a general class of convex objective functionals that includes the KL divergence as a special case, which may be of independent interest.
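The Wasserstein proximal step analyzed here is the direct analogue of the Euclidean proximal operator; writing the two side by side makes the "proximal GD in Wasserstein space" reading explicit:

$$
\operatorname{prox}_{\gamma G}(x) = \operatorname*{arg\,min}_{y} \; G(y) + \frac{1}{2\gamma} \|y - x\|^2,
\qquad
\rho_{k+1} = \operatorname*{arg\,min}_{\rho \in \mathcal{P}_2} \; G(\rho) + \frac{1}{2\gamma} W_2^2(\rho, \rho_k),
$$

with $G = \mathrm{KL}(\cdot \,\|\, \pi)$ recovering the JKO step; the exponential convergence of this iteration is what yields the $N \lesssim \log(1/\varepsilon)$ step count.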
Variational Wasserstein gradient flow
Fan, Jiaojiao, Zhang, Qinsheng, Taghvaei, Amirhossein, Chen, Yongxin
Wasserstein gradient flow has emerged as a promising approach to solving optimization problems over the space of probability distributions. A recent trend is to use the well-known JKO scheme in combination with input convex neural networks to numerically implement the proximal step. The most challenging step in this setup is evaluating functionals that involve the density explicitly, such as entropy, in terms of samples. This paper builds on these recent works with a slight but crucial difference: we propose to utilize a variational formulation of the objective function, formulated as a maximization over a parametric class of functions. Theoretically, the proposed variational formulation allows the construction of gradient flows directly for empirical distributions with a well-defined and meaningful objective function. Computationally, this approach replaces the computationally expensive density-handling step of existing methods with inner-loop updates that require only a small batch of samples and scale well with dimension. The performance and scalability of the proposed method are illustrated with several numerical experiments involving high-dimensional synthetic and real datasets.
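As a concrete instance of such a variational formulation (our example; the paper treats a general class of functionals), the negative entropy admits the convex-dual representation

$$
\int \rho \log \rho \, dx = \sup_{g} \; \mathbb{E}_{x \sim \rho}[g(x)] - \int e^{g(x) - 1} \, dx,
$$

in which the supremum over a parametric class of functions $g$ replaces any explicit density evaluation: the first term is a plain sample average over $\rho$, and the second is an integral against a fixed reference measure that can be estimated by Monte Carlo.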
Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks
Alvarez-Melis, David, Schiff, Yair, Mroueh, Youssef
Gradient flows are a powerful tool for optimizing functionals in general metric spaces, including the space of probabilities endowed with the Wasserstein metric. A typical approach to solving this optimization problem relies on its connection to the dynamic formulation of optimal transport and the celebrated Jordan-Kinderlehrer-Otto (JKO) scheme. However, this formulation involves optimization over convex functions, which is challenging, especially in high dimensions. In this work, we propose an approach that relies on the recently introduced input-convex neural networks (ICNN) to parameterize the space of convex functions in order to approximate the JKO scheme, as well as to design functionals over measures that enjoy convergence guarantees. We derive a computationally efficient implementation of this JKO-ICNN framework and use various experiments to demonstrate its feasibility and validity in approximating solutions of low-dimensional partial differential equations with known solutions. We also explore the use of our JKO-ICNN approach in high dimensions with an experiment in controlled generation for molecular discovery.
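A minimal sketch of the ICNN building block, assuming PyTorch (hypothetical code, not the authors'): convexity of $\varphi$ in the input $x$ holds when the hidden-to-hidden and output weights are nonnegative and the activations are convex and nondecreasing, and the transport map is then the gradient $\nabla \varphi$ of the learned potential.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Input-convex network phi(x): convex in x because softplus is convex
    and nondecreasing and the z->z / output weights are kept nonnegative."""
    def __init__(self, dim, hidden=64, layers=3):
        super().__init__()
        self.Wx = nn.ModuleList([nn.Linear(dim, hidden) for _ in range(layers)])
        self.Wz = nn.ModuleList([nn.Linear(hidden, hidden, bias=False)
                                 for _ in range(layers - 1)])
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        z = F.softplus(self.Wx[0](x))
        for Wx, Wz in zip(self.Wx[1:], self.Wz):
            z = F.softplus(Wx(x) + Wz(z))
        return self.out(z)

    def project(self):
        # call after each optimizer step to restore the convexity constraints
        with torch.no_grad():
            for Wz in self.Wz:
                Wz.weight.clamp_(min=0)
            self.out.weight.clamp_(min=0)

# The transport map is the gradient of the convex potential (Brenier's theorem):
phi = ICNN(dim=2)
x = torch.randn(8, 2, requires_grad=True)
(T,) = torch.autograd.grad(phi(x).sum(), x)   # T(x) = grad phi(x), shape (8, 2)
```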